ABSTRACT

A simple physical model of residential energy consumption provides the
framework for an exploration of segmented regression models fit by least
squares. The energy model is a generalization of a linear, single
change-point model such as that considered by Hinkley (1971).

Some  simple  geometric  measures  of  nonlinearity  and  nondifferentiability 
are  proposed.  These  measures  are  related  to  the  construction  of  approximate 
confidence  regions  for  the  parameters  of  a  general  segmented  model.  In 
addition,  the  relation  shown  between  these  measures  and  those  proposed  by 
Bates  and  Watts  (1980)  may  be  useful  in  analyzing  continuously  differentiable 
models. 
SIGNIFICANCE AND EXPLANATION


Simple procedures are presented for assessing the severity of nonlinearity
in a regression model involving a function which is nondifferentiable with
respect to the unknown parameters. The nonlinearity measures proposed
indicate the validity of standard approximations which may be used to
determine the accuracy of parameter estimates. The proposed measures are
related to existing measures of nonlinearity, but can be applied to a broader
class of models, and in some cases may be easier to calculate.

The methods developed are motivated and illustrated by a simple model of
residential energy consumption. This model has been the basis for
measurements of energy conservation in several studies.



The responsibility for the wording and views expressed in this descriptive
summary lies with MRC, and not with the author of this report.


Table  of  Contents 


1.  Introduction

2.  The  Energy  Model 

3.  The  Geometry  of  Nonlinear  Least  Squares 

4.  Nonlinearity  in  the  Energy  Model 

5.  Impact  of  Nonlinearities  on  Approximate  Confidence  Regions 

6.  Visual  Indicators  of  Nonlinearity 

7.  Quantifying  Nonlinearity 

8.  Effective  Curvatures  for  Segmented  Models 

9.  Effective  Curvature  of  the  Energy  Model 

10.  Smoothing the Model Function

11.  Conclusion

Appendix A


References 


MEASURES OF NONLINEARITY
FOR SEGMENTED REGRESSION MODELS


Miriam L. Goldberg

1.  Introduction

The object of this paper is to develop the geometry of nondifferentiable
least squares problems, and within that framework to indicate some simple
procedures for assessing the effects of nonlinearity. Our exploration of
piecewise differentiable regression models is based on a simple model which
arises in the context of residential energy analyses. This model is a
generalization of a linear change-point model.

We  begin  by  describing  the  motivating  model.  After  reviewing  some  basic 
elements of the geometry of nonlinear regressions, we then examine the
behavior  of  the  residual  sum  of  squares  function  for  the  energy  model,  and 
relate  this  function  to  approximate  confidence  intervals  for  the  unknown 
parameters.  Finally,  we  consider  some  measures  of  nonlinearity,  which 
indicate  the  validity  of  these  approximations,  and  which  are  appropriate  for 
nondifferentiable  models.  In  addition,  these  measures  may  be  useful  for 
certain  types  of  continuously  differentiable  models. 


Sponsored by the United States Army under Contract No. DAAG29-80-C-0041.

This  material  is  based  upon  work  supported  by  the  National  Science  Foundation 
under  Grant  No.  MCS-7927062,  Mod.  2,  and  by  joint  funding  from  the  Ford 
Foundation's  State  Environmental  Management  Program  and  the  New  Jersey 
Department  of  Energy. 


2.  The  Energy  Model 

A simple model of residential energy consumption assumes daily
consumption is constant at the baseload level $\alpha$ as long as the average
outdoor temperature $T$ is above a reference temperature $\tau$, and increases in
proportion to $\tau - T$ for $T < \tau$. With $Y_m$ representing average daily fuel
consumption for the days of month m, our model is expressed formally as

$$Y_m = \alpha + \beta H_m(\tau) + \varepsilon_m, \tag{2.1}$$

where

$$H_m(\tau) = \frac{1}{N_m} \sum_{j=1}^{N_m} (\tau - T_{mj})\, I(T_{mj} \le \tau), \tag{2.2}$$

$I$ is the indicator function and $\varepsilon_m$ is a random disturbance. Here $T_{mj}$ is
the outdoor temperature on day j of month m, and $N_m$ is the number of days
in the month. The variable $H_m(\tau)$ represents the average daily base-$\tau$
degree-days for the month. The temperature $\tau$ is interpreted as the maximum
outdoor temperature at which the furnace is required to heat the house, and
$\beta$ as the house's effective heat loss rate.
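
The degree-day variable of Equation (2.2) is simple to compute from one month of daily temperatures. The following is a minimal Python sketch rather than code from the original study; the function names and the two-day sample month are hypothetical.

```python
def degree_days(temps, tau):
    """Average daily base-tau degree-days for one month, as in Equation (2.2).

    temps: daily outdoor temperatures T_mj for the month.
    tau:   reference temperature.
    """
    return sum(tau - t for t in temps if t <= tau) / len(temps)


def expected_consumption(alpha, beta, temps, tau):
    """Mean of Y_m in Equation (2.1): baseload plus heating load."""
    return alpha + beta * degree_days(temps, tau)


# Hypothetical month with two days: only the 50-degree day needs heating.
print(degree_days([50, 70], 60))                      # 5.0
print(expected_consumption(2.0, 0.1, [50, 70], 60))   # 2.5
```

Days at or above the reference temperature contribute nothing, so the variable is zero whenever the reference temperature lies below every observed temperature.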

Models of this type have been the basis for analyses of energy
consumption patterns in a large number of gas-heated houses, and in a smaller
number of oil- and electrically-heated houses.* The consumption data $Y_m$ are
derived from a customer's fuel bills. The daily temperature data $T_{mj}$, in
integer degrees Fahrenheit, are obtained from a nearby U.S. Weather Bureau
station (National Oceanic and Atmospheric Administration, monthly).

Equation (2.1) has also been applied to aggregate data,** with $Y_m$
representing fuel consumption per household for month m. For utility- or
state-wide aggregates, a different definition of the degree-day variable
$H_m(\tau)$ is used, to account for the lag introduced by meters' being read on
different days throughout the month:

[Equation (2.3): aggregate definition of $H_m(\tau)$, with day weights of the
form $(N + 1 - j)$ and $j$; the remainder is not recoverable.]

* See, e.g., Fels et al (1981), Dutt, Lavine et al (1982), and Socolow (1978).

** See Fels and Goldberg (1982) and Goldberg and Fels (1982).
For both single-house and aggregate analyses, the major use of the model
defined by Equation (2.1) is in determining the normalized annual consumption
$\Gamma$. The index $\Gamma$ is given by

$$\Gamma = 365\,(\alpha + \beta H_0(\tau)), \tag{2.4}$$

where $H_0(\tau)$ is the long-term (several-year) average of daily degree-days
base $\tau$.
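
As a small worked illustration of Equation (2.4), the sketch below evaluates the normalized annual consumption index for given parameter values. It is an illustrative helper only; the function name and the numerical inputs are hypothetical.

```python
def normalized_annual_consumption(alpha, beta, h0):
    """Equation (2.4): 365 * (alpha + beta * H_0(tau)).

    alpha: baseload (daily consumption with no heating).
    beta:  heating rate per degree-day.
    h0:    long-term average of daily degree-days at the reference temperature.
    """
    return 365.0 * (alpha + beta * h0)


# Hypothetical values: baseload 2.0, heating rate 0.1, 10 degree-days per day.
print(normalized_annual_consumption(2.0, 0.1, 10.0))  # 1095.0
```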

If consumption data were available on a daily, rather than monthly,
basis, so that $N_m = 1$, Equation (2.1) would represent a simple change-point
regression with slope zero over one region. Such a model has been analyzed in
detail by Hinkley (1971). In addition to the summation in Equation (2.2) or
(2.3), a second important difference between the energy model considered here
and Hinkley's change-point model is in the restriction placed on the
temperature data $T_{mj}$, as discussed below in Section 5.

For the energy model, we will consider estimation of the reference
temperature $\tau$, baseload $\alpha$, and heating rate $\beta$ by the method of least
squares. Placing the model in a more general context, we treat it as a
special case of a piecewise differentiable model. We will explore the
behavior of such models in the framework of general nonlinear models.
Naturally, many existing results for simple change-point models relate closely
to this problem. We will continually return to the energy model defined by
Equation (2.1) for illustration.
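
Because the model is linear in the baseload and heating rate once the reference temperature is fixed, one simple way to obtain least squares estimates is to scan a grid of reference temperatures and solve an ordinary regression at each. The sketch below takes that route; it is a plain grid search offered as an illustration under stated assumptions, not necessarily the fitting procedure used in the study, and the data are hypothetical and noise-free.

```python
def degree_days(temps, tau):
    """Average daily base-tau degree-days for one month (Equation (2.2))."""
    return sum(tau - t for t in temps if t <= tau) / len(temps)


def fit_given_tau(y, months, tau):
    """Ordinary least squares of y on [1, H(tau)] for a fixed tau.

    Returns (alpha, beta, rss).
    """
    h = [degree_days(temps, tau) for temps in months]
    n = len(y)
    h_bar = sum(h) / n
    y_bar = sum(y) / n
    s_hh = sum((hi - h_bar) ** 2 for hi in h)
    if s_hh == 0.0:
        beta = 0.0  # H(tau) is the same every month, e.g. tau below all data
    else:
        beta = sum((hi - h_bar) * (yi - y_bar) for hi, yi in zip(h, y)) / s_hh
    alpha = y_bar - beta * h_bar
    rss = sum((yi - alpha - beta * hi) ** 2 for hi, yi in zip(h, y))
    return alpha, beta, rss


def fit_energy_model(y, months, tau_grid):
    """Grid search over tau; returns (alpha, beta, tau, rss) with smallest RSS."""
    fits = [fit_given_tau(y, months, tau) + (tau,) for tau in tau_grid]
    alpha, beta, rss, tau = min(fits, key=lambda fit: fit[2])
    return alpha, beta, tau, rss


# Hypothetical noise-free data generated with alpha = 2, beta = 0.1, tau = 60.
months = [[20, 30, 40], [35, 45, 55], [50, 58, 66],
          [61, 65, 72], [70, 75, 80], [55, 63, 68]]
y = [2.0 + 0.1 * degree_days(temps, 60) for temps in months]
alpha, beta, tau, rss = fit_energy_model(y, months, range(50, 71))
print(alpha, beta, tau)
```

The list of residual sums of squares produced along the grid is also the profile of RSS against the reference temperature, which is useful for display.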


Our  emphasis  is  on  methods  for  assessing  the  validity  of  approximate 
confidence  intervals  for  the  model  parameters.  Two  approximation  methods  are 
considered.  One  is  based  on  the  asymptotic  normality  of  the  least  squares 
estimates, and implicitly on a linearization of the model function. The other
is  based  on  the  asymptotic  chi-squared  distribution  of  the  likelihood  ratio, 
and  uses  regions  bounded  by  contours  of  constant  Residual  Sum  of  Squares 
(RSS).  In  developing  methods  for  assessing  the  adequacy  of  these 
approximations,  we  will  rely  on  the  geometry  of  nonlinear  least  squares. 

3.  The  Geometry  of  Nonlinear  Least  Squares 

The geometrical approach has been developed extensively for continuously
differentiable models, and will be applied here to the general piecewise
differentiable model. The general nonlinear model with unknown p-dimensional
parameter $\theta$ can be written in matrix form as

$$Y = \eta(\theta) + \varepsilon \tag{3.1}$$

$$E(\varepsilon) = 0 \tag{3.2}$$

$$E(\varepsilon\varepsilon') = \sigma^2 I. \tag{3.3}$$

Here, $Y$, $\eta$, and $\varepsilon$ are n-dimensional vectors, such that $\eta_m$, the mth
component of $\eta$, depends on observations $x_m$ as well as on $\theta$. We further
assume that the random disturbance $\varepsilon$ has a Gaussian distribution. Note that
$\varepsilon$ enters the model linearly; the nonlinearity is only in the model
function $\eta$. For the energy model given by Equation (2.1),
$\theta = [\alpha, \beta, \tau]'$ (with the apostrophe denoting the transpose) and

$$\eta(\theta) = \alpha \mathbf{1} + \beta H(\tau), \tag{3.4}$$

where $H$ is the n-dimensional vector with components given by Equation
(2.2) or Equation (2.3) and $\mathbf{1}$ signifies an n-dimensional vector of ones.
The observations $x_m$ are vectors of daily temperatures $T_{mj}$.

For the general model, as $\theta$ ranges over the parameter space $\Theta$, the
function $\eta(\theta)$ sweeps out a p-dimensional surface or "solution locus" $L$ in
the n-dimensional sample space:

$$L = \{\eta(\theta) : \theta \in \Theta\}.$$

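We define the derivative vector $\dot\eta_m$ and the Hessian matrix $\ddot\eta_m$ as

$$\dot\eta_m(\theta) = \left[\frac{\partial \eta_m}{\partial \theta_i}\right]_i, \qquad \ddot\eta_m(\theta) = \left[\frac{\partial^2 \eta_m}{\partial \theta_i\, \partial \theta_j}\right]_{ij},$$

and write $\dot\eta(\theta) = [\partial \eta_m / \partial \theta_i]_{mi}$ for the n x p derivative matrix.
(Throughout this paper an expression in square brackets indicates a matrix
with components given by the subscripted expression, such that
$m = 1, 2, \ldots, n$ and $i, j = 1, \ldots, p$.)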

The least squares estimate $\hat\theta$ is the solution to the normal equation

$$\dot\eta'(\hat\theta)\,(Y - \eta(\hat\theta)) = 0.$$

That is, the residual vector $Y - \eta(\hat\theta)$ is normal to the tangent plane at
$\eta(\hat\theta)$, the tangent plane being the linear span of the column vectors which
make up the matrix $\dot\eta(\hat\theta)$. In this sense, $\eta(\hat\theta)$ is the projection of
$Y$ onto the solution locus $L$. The estimate $\hat\theta$ is determined by pulling back
the projection $\eta(\hat\theta) = \hat Y$ to the parameter space $\Theta$.

To quantify the severity of departures from linearity, Bates and Watts
(1980) propose measuring the nonlinearity of a model in terms of the curvature
of the solution locus. For any direction $v$ in the parameter space, the
tangent $t_v$ and acceleration vector $a_v$ at $\theta$ are defined by

$$t_v = \dot\eta(\theta)\, v \tag{3.5}$$

$$a_v = [\,v' \ddot\eta_m(\theta)\, v\,]_m. \tag{3.6}$$

The curvature in the $v$ direction is then defined as

$$K_v = \frac{\|a_v\|}{\|t_v\|^2}.$$


The relative curvature $\gamma_v$ is obtained by multiplying $K_v$ by the standard
radius $\sqrt{p s^2}$, where $s^2$ is an estimate of the error variance $\sigma^2$. In the
present work, $s^2$ is always obtained from the residual sum of squares RSS
from the regression, as $s^2 = \mathrm{RSS}(\hat\theta)/(n-p)$. Decomposing the acceleration
vector into components $a_v^T$ and $a_v^N$, respectively parallel and normal to the
tangent plane, yields analogous definitions for tangential and normal
curvatures $K_v^T$ and $K_v^N$, and relative curvatures $\gamma_v^T$ and $\gamma_v^N$.
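
To make the curvature definitions concrete, the sketch below computes the total, tangential, and normal curvature in one direction, taking the curvature to be the norm of the acceleration vector divided by the squared norm of the tangent vector, as in Bates and Watts (1980). It is restricted, as a simplifying assumption, to a one-parameter model, so that the tangent plane is the line spanned by the tangent vector; the function names and the parabola example are hypothetical.

```python
import math


def dot(u, w):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, w))


def norm(u):
    return math.sqrt(dot(u, u))


def directional_curvatures(tangent, accel):
    """Return (K, K_T, K_N) for a one-parameter model.

    tangent: tangent vector t_v; accel: acceleration vector a_v.
    The acceleration is split into parts parallel and normal to t_v.
    """
    tt = dot(tangent, tangent)
    coef = dot(accel, tangent) / tt
    a_par = [coef * t for t in tangent]
    a_nor = [a - ap for a, ap in zip(accel, a_par)]
    return norm(accel) / tt, norm(a_par) / tt, norm(a_nor) / tt


def relative_curvature(curvature, s2, p):
    """Scale a curvature by the standard radius sqrt(p * s^2)."""
    return curvature * math.sqrt(p * s2)


# Hypothetical model eta(theta) = (theta, theta**2), evaluated at theta = 0:
# tangent (1, 0), acceleration (0, 2).
k, k_t, k_n = directional_curvatures([1.0, 0.0], [0.0, 2.0])
print(k, k_t, k_n)  # 2.0 0.0 2.0
```

At the vertex of the parabola the acceleration is entirely normal to the tangent, so all of the curvature is intrinsic; away from the vertex part of it becomes tangential.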

Noting that the tangential acceleration component is caused by the
parameterization chosen, while the normal component is independent of this
choice, Bates and Watts refer to the parallel $K_v^T$ as the "parameter-effects"
curvature, and to the normal $K_v^N$ as the "intrinsic" curvature. That is, the
acceleration component normal to the solution locus $L$ describes the bending
of the p-dimensional surface $L$ in n-dimensional Euclidean space. The
acceleration component parallel to the tangent plane simply reflects the
meandering within the solution locus of the "lifted line"

$$\eta_v = \{\eta(\theta + r v) : r \in R\}.$$

The parameter-effects curvature can in principle be reduced or removed by
an appropriate reparameterization (Bates and Watts, 1981). By contrast, we
may consider a model to be intrinsically nonlinear with respect to the
parameter $\theta_j$ if the normal curvature (or acceleration) in the direction of
$\theta_j$ is nonzero. This is equivalent to requiring that the vector
$\partial^2 \eta(\theta)/\partial \theta_j^2$ (composed of the jth diagonal elements of the matrices
$\ddot\eta_m(\theta)$) does not lie in the plane spanned by the columns of $\dot\eta(\theta)$.


4.  Nonlinearity in the Energy Model

Many of the problems introduced by nondifferentiability can be understood
in terms of the general geometrical framework just described. For the energy
model defined by Equation (2.1), the model function $\eta$ is given by
Equation (3.4). Thus,

$$\dot\eta(\alpha,\beta,\tau) = \left[\frac{\partial \eta}{\partial \alpha} \;\Big|\; \frac{\partial \eta}{\partial \beta} \;\Big|\; \frac{\partial \eta}{\partial \tau}\right] = [\,\mathbf{1} \mid H(\tau) \mid \beta F(\tau)\,]. \tag{4.1}$$

The degree-day derivative $F$ is obtained by dropping the terms of the form
$(\tau - T_{mj})$ from Equation (2.2) or (2.3). For the single house, we have

$$F_m(\tau) = \frac{1}{N_m} \sum_{j=1}^{N_m} I(T_{mj} \le \tau).$$


That is, $F$ is (arbitrarily) defined to be right-continuous at discontinuity
points $T_{mj}$, which occur only at integer Fahrenheit degrees. The step-
function $F_m$ is thus the empirical distribution function of outdoor
temperatures for month m, and $H_m$ is the convolution of temperature
with $F_m$.
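
The relation between the degree-day variable and its derivative can be checked numerically: between two successive integers the slope of the degree-day function equals the empirical distribution function of the temperatures. The sketch below is illustrative only, with hypothetical function names and a hypothetical four-day month of integer temperatures.

```python
def degree_days(temps, tau):
    """Average daily base-tau degree-days for one month (Equation (2.2))."""
    return sum(tau - t for t in temps if t <= tau) / len(temps)


def temperature_cdf(temps, tau):
    """Fraction of days with temperature at or below tau (right-continuous)."""
    return sum(1 for t in temps if t <= tau) / len(temps)


def derivative_row(beta, temps, tau):
    """One row of the derivative matrix (4.1): [1, H(tau), beta * F(tau)]."""
    return [1.0, degree_days(temps, tau), beta * temperature_cdf(temps, tau)]


temps = [30, 40, 50, 60]      # hypothetical integer temperatures
tau, step = 45.5, 0.25        # tau and tau + step lie between the same integers
slope = (degree_days(temps, tau + step) - degree_days(temps, tau)) / step
print(slope, temperature_cdf(temps, tau))  # 0.5 0.5
```

The finite difference is exact here because the degree-day function is linear on each interval between integer temperatures.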


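The Hessian $\ddot\eta_m$ for Equation (2.1) is given by

$$\ddot\eta_m(\alpha,\beta,\tau) = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & F_m(\tau) \\ 0 & F_m(\tau) & \beta\, \partial F_m / \partial \tau \end{bmatrix}. \tag{4.2}$$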


The only nonzero diagonal element is $\partial^2 \eta_m / \partial \tau^2 = \beta\, \partial F_m / \partial \tau$, which is a delta
function with spikes at discontinuity points of $F_m$ (i.e., at integers). Thus,
the energy model is intrinsically linear with respect to $\alpha$ and $\beta$, but
intrinsically nonlinear with respect to $\tau$. Between any two successive
integers, however, the model is also intrinsically linear in $\tau$, since
$\beta\, \partial F / \partial \tau = 0$. Hence, the solution locus $L$ (and in this sense the model) is
piecewise planar. However, the model function $\eta$ is nonlinear in $\beta$ as well
as in $\tau$, since $\partial^2 \eta / \partial \beta\, \partial \tau$ is nonzero. Thus, in addition to whatever intrinsic
nonlinearity results from the discontinuities in $F(\tau)$, we expect to find
effects of nonlinearity in the parameterization.


5.  Impact of Nonlinearities on Approximate Confidence Regions

Approximate confidence regions for a q-dimensional linear combination
$C\theta$ based on the asymptotic normality of $\hat\theta$ are given by the set of $\theta$
satisfying

$$(\theta - \hat\theta)'C'\{C(\dot\eta'\dot\eta)^{-1}C'\}^{-1}C(\theta - \hat\theta) \le q s^2 F^{\pi}_{q,n-p}. \tag{5.1}$$

In Equation (5.1), $\dot\eta$ is the derivative evaluated at $\hat\theta$, $\pi$ denotes a
probability, and $F^{\pi}_{q,n-p}$ the $1-\pi$ quantile of the F-distribution with q
and n-p degrees of freedom. If the model function $\eta$ is linear, the region
defined by Equation (5.1) has exact confidence level $1-\pi$ even in finite
samples. The small-sample validity of such confidence regions is affected by
both parameter-effects and intrinsic nonlinearity.

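By contrast, the sum-of-squares based regions are unaffected by parameter
effects. For continuously differentiable q-dimensional functions $g$,
such a region is the set of $g(\theta)$ such that

$$\frac{\mathrm{RSS}(\theta) - \mathrm{RSS}(\hat\theta)}{\mathrm{RSS}(\hat\theta)} \cdot \frac{n-p}{q} \le F^{\pi}_{q,n-p} \tag{5.2}$$

or

$$\mathrm{RSS}(\theta) \le \mathrm{RSS}(\hat\theta)\left(1 + \frac{q}{n-p}\, F^{\pi}_{q,n-p}\right). \tag{5.3}$$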


Note that confidence intervals for a single component $\theta_j$ are obtained from
Equation (5.3) by taking $q = 1$ and $g(\theta) = \theta_j$.

The region defined by Equation (5.3) is the inverse image in the
parameter space $\Theta$ of the intersection of the solution locus $L$ with a sphere
centered at $Y$ whose squared radius is
$\mathrm{RSS}(\hat\theta)\,(1 + \frac{q}{n-p} F^{\pi}_{q,n-p})$. If the
solution locus is flat, the region determined by Equation (5.3) has exact
confidence level $1-\pi$ in finite samples. Thus, the small-sample validity of
the sum-of-squares approximate confidence region depends on how sharply the
solution locus departs from the approximating tangent plane over the region of
interest.
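
A sum-of-squares interval for a single parameter can be read off a profile of RSS values once the bound in Equation (5.3) is computed. The sketch below assumes the profile (RSS minimized over the other parameters at each grid value) and the F quantile are supplied by the caller; the function names and numbers are hypothetical, and the interval is reported simply as the extreme grid points satisfying the bound.

```python
def rss_threshold(rss_min, n, p, q, f_crit):
    """Right-hand side of Equation (5.3).

    rss_min: residual sum of squares at the least squares estimate.
    f_crit:  F quantile with q and n - p degrees of freedom (caller-supplied).
    """
    return rss_min * (1.0 + q / (n - p) * f_crit)


def profile_interval(profile, n, p, f_crit):
    """Extreme grid values whose profile RSS satisfies Equation (5.3), q = 1.

    profile: dict mapping a grid value of one parameter to the RSS
             minimized over the remaining parameters.
    """
    rss_min = min(profile.values())
    limit = rss_threshold(rss_min, n, p, 1, f_crit)
    inside = sorted(v for v, rss in profile.items() if rss <= limit)
    return inside[0], inside[-1]


# Hypothetical profile of RSS against the reference temperature.
profile = {58: 14.0, 59: 11.5, 60: 10.0, 61: 10.8, 62: 13.0}
low, high = profile_interval(profile, n=13, p=3, f_crit=1.0)
print(low, high)  # 60 61
```

Because the RSS function of a segmented model need not be convex, the set of grid points inside the bound may not be contiguous; a fuller implementation would check for that.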

For  continuously  differentiable  models  the  use  of  confidence  regions 
based  on  Equation  (5.1)  or  (5.2)  is  well-established.  The  asymptotic  validity 
of  confidence  regions  defined  by  Equation  (5.1)  was  proved  by  Fisher  (1925), 
and  the  validity  of  regions  determined  by  Equation  (5.2)  by  Wilks  (1938). 

For small samples, Beale (1960) proposed an inflation factor $\mu$ for the
right-hand side of Equation (5.2) which yields a conservative confidence
region for the case $q = p$. Beale's factor $\mu$ is given by

$$\mu = 1 + \frac{n(p+2)}{(n-p)\,p}\, N_\phi. \tag{5.4}$$

Bates and Watts (1980) showed Beale's nonlinearity measure $N_\phi$ to be equal to
one quarter the mean square relative intrinsic curvature $\gamma^N$, and showed also
that the factor $\mu$ was very close to one for a wide variety of data sets.

For cases where the parameter-effects curvature is slight, Hamilton,
Watts and Bates (1982) showed how to approximate the sum-of-squares region
given by Equation (5.2) using an elliptical region similar to that given by
Equation (5.1), but with a correction for the intrinsic nonlinearity. Bates
and Watts (1981) suggested ways of choosing parameter transformations to
reduce the parameter-effects curvature, rendering the Gaussian approximation
regions defined by Equation (5.1) more accurate. Bates and Watts (1980)
indicated that the sum-of-squares regions may be considered reliable if the
intrinsic curvature $K^N$ is small compared to $1/\sqrt{p s^2 F^{\pi}_{p,n-p}}$, the
normal-based regions if the parameter-effects curvature $K^T$ is also small
compared to this quantity.

In using the approximate confidence regions defined by Equation (5.1) or
(5.2) for a piecewise differentiable model, we have two main concerns of a
theoretical nature. The first is to establish the asymptotic validity of
these confidence regions for our non-regular case. The second is to find ways
of assessing the severity of both parameter-effects and intrinsic nonlinearity
for nondifferentiable models with finite samples. We will deal briefly with
the first concern before proceeding to our main purpose, the development of
nonlinearity measures for segmented models.

Hinkley (1969) proved the asymptotic normality of least squares estimates
for the simple change-point linear regression. His methods are not quite
applicable to the model defined by Equation (2.1), because the observations
(temperature data) for this model are taken only at certain fixed points
(integers), while Hinkley's proof assumes the observations may come
arbitrarily close to the change point. For the general piecewise
differentiable model with discontinuities in the derivative at fixed points,
the present author (Goldberg, 1982) has shown that the least squares estimates
are asymptotically normal, except when the true parameter value is at a point
of discontinuity; in that case, the normal approximation yields asymptotically
conservative confidence intervals.

The  asymptotic  normality  justifies  the  use  of  sum-of-squares  contours  to 
define  likelihood  regions.  For  higher  confidence  levels  or  for  more  strongly 
skewed  RSS  functions,  the  likelihood  approach  should  be  more  accurate,  in  the 


We  turn  now  to  the  question  of  how  to  assess  the  severity  of  nonlinearity 
in  a  particular  small  sample.  We  begin  by  considering  some  useful  display 
techniques.

One way to study the effects of nonlinearity in the energy model is to
examine the residual sum of squares RSS as a function of the "nonlinear"
parameter $\tau$. If our model function $\eta$ were linear in $\tau$, then RSS($\tau$)
would be a quadratic function. Instead, we expect to see a more irregular
function, with kinks at discontinuity points of $\partial \eta / \partial \tau$, that is, at each
integer value of $\tau$.

Figure 1 shows a plot of RSS versus the change point $\tau$ for a typical
data set fit to Equation (2.1).* Above the maximum and below the minimum
observed temperature $T_{mj}$, the function is flat, indicating that the reference
temperature $\tau$ is not identifiable if it falls outside the range of the
temperature data. In the region of low $\tau$, where these data are very sparse,
we do see the somewhat jagged behavior anticipated. For similar change-point
models, Hudson (1966) and Hinkley (1971) have also shown RSS curves of this
general shape, but with a more pronounced scalloped appearance.

Overall, and especially in the neighborhood of the minimum (i.e., in the
neighborhood of the least squares estimate $\hat\tau$) the RSS function looks fairly
smooth, offering some hope that procedures which have been developed for
continuously differentiable models may still be useful in the present
application.

[Figure 1: Residual Sum of Squares RSS($\tau$) Versus Reference Temperature $\tau$.
Vertical axis: RSS($\tau$), in (Th/cu-d)^2.]

In particular, in addition to the inference procedures which are the
focus of the present work, fitting procedures for smooth models can be
extended to the model defined by Equation (2.1). The fitting procedure used
in this study, discussed by Dutt, Fels et al (1982) and in more detail by the
present author (Goldberg, 1982), is based on Newton's method. This procedure
represents a modification of a method described by Hinkley (1969) for simple
change-point models, and in most cases is more efficient.

Figure 2 shows three residual-sum-of-squares curves. The first is the
original curve RSS($\tau$). The second, $\mathrm{RSS}^*_{\hat k}$, is the residual sum of squares for
an approximate model function

$$\eta^*_{\hat k}(\alpha,\beta,\tau) = \alpha \mathbf{1} + \beta\{H(\hat k) + (\tau - \hat k)\, F(\hat k)\}.$$

The function $\eta^*_{\hat k}$ extends to the whole real line the planar function which
defines $\eta(\alpha,\beta,\tau)$ in the integer interval $[\hat k, \hat k+1]$ containing $\hat\tau$. For any
integer $k$, the approximation $\mathrm{RSS}^*_k$ coincides with the original function RSS
for values of $\tau$ in $[k, k+1]$. The curve $\mathrm{RSS}^*_{\hat k}$ shown coincides with RSS in
the interval containing the minimum. The third curve shown is the quadratic
approximation RSSQ based on a linearization of the model function $\eta$.

The discrepancy between the original RSS($\tau$) and the extension $\mathrm{RSS}^*_{\hat k}(\tau)$
stems from the departure of the solution locus from the plane spanned by $\mathbf{1}$,
$H(\hat k)$, and $F(\hat k)$. Thus, the divergence of RSS from $\mathrm{RSS}^*_{\hat k}$ is an indicator of
intrinsic nonlinearity. The discrepancy between $\mathrm{RSS}^*_{\hat k}$ based on the planar
extension and RSSQ based on the linear approximation to the model function
reflects parameter-effects nonlinearity. Both types of nonlinearity appear
from Figure 2 to be slight.
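
The planar extension can be illustrated directly on the degree-day variable: starting from an integer k, the extension continues the linear piece that holds on the interval from k to k+1. The sketch below is a minimal illustration with hypothetical function names and hypothetical integer temperatures; it shows that the extension agrees with the original function inside the interval and departs from it outside.

```python
def degree_days(temps, tau):
    """Average daily base-tau degree-days for one month (Equation (2.2))."""
    return sum(tau - t for t in temps if t <= tau) / len(temps)


def temperature_cdf(temps, tau):
    """Fraction of days with temperature at or below tau."""
    return sum(1 for t in temps if t <= tau) / len(temps)


def planar_extension(temps, tau, k):
    """Linear continuation from the interval [k, k+1]: H(k) + (tau - k) F(k)."""
    return degree_days(temps, k) + (tau - k) * temperature_cdf(temps, k)


temps = [30, 40, 50, 60]   # hypothetical integer temperatures
k = 45
print(planar_extension(temps, 45.5, k), degree_days(temps, 45.5))  # 5.25 5.25
print(planar_extension(temps, 55, k), degree_days(temps, 55))      # 10.0 11.25
```

Replacing the degree-day variable by its extension in the model function and refitting the baseload and heating rate at each reference temperature would give the extended residual sum of squares curve.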


[Figure 2: Residual Sum of Squares Function RSS, Extension RSS*, and
Quadratic Approximation RSSQ. Axes: RSS in (Th/cu-d)^2 versus $\tau$ (°F).
See caption to Figure 1.]


Another way to see nonlinearity is to examine two-dimensional projections
of the sum-of-squares regions defined by Equation (5.3). If the boundary of
such a projection is not elliptical, this is evidence of strong nonlinearity.
Figure 3 shows several such regions for parameters of the energy model, for
varying values of $F^{\pi}$, for the data set displayed in the first two
figures. For small values of $F^{\pi}$, corresponding to confidence levels of
0.99 or less ($\pi \ge 0.01$), the regions shown in Figure 3 all look fairly
elliptical. Only at rather high confidence levels, which are of little
practical interest, do the contours become appreciably distorted from the
elliptical ideal.

By  itself,  unfortunately,  the  shape  of  the  sum-of-squares  regions  gives 
only  limited  information  about  the  nature  of  the  nonlinearity.  If  these 
regions  are  not  elliptical,  the  Gaussian  approximation  (Equation  (5.1))  is 
clearly  inadequate  to  give  confidence  regions.  At  the  same  time,  interpreting 
the  sum-of-squares  regions  themselves  as  confidence  regions  (of  the  indicated 
confidence  level)  may  or  may  not  be  valid.  The  reason  for  this  ambiguity  is 
that  the  distortion  from  the  ellipse  may  reflect  the  shape  of  the  solution 
locus $L$ itself (indicating strong intrinsic nonlinearity), or might simply
result from a nonlinear mapping between $L$ and the parameter space $\Theta$
(parameter-effects  nonlinearity). 

Conversely,  certain  types  of  intrinsic  and  parameter-effects 
nonlinearities  will  still  yield  elliptical  contours.  Thus,  the  breakdown  of 
either  approximation  (5.1)  or  (5.3)  may  not  be  manifest  in  simple  examination 
of  regions  such  as  those  drawn  in  Figure  3. 

Somewhat more informative is a comparison of the (projected) sum-of-squares
regions defined by Equation (5.3) with the elliptical regions defined
by Equation (5.1), for various values of $\pi$. Here again, though, the
implications of the visual comparison are ambiguous. A particular effect may
result either from the falling off of the solution locus from the tangent
plane (intrinsic nonlinearity) or from the distortion of coordinate lines
within the plane (parameter-effects nonlinearity).

[Figure 3: Sum-of-Squares Contours for Several Confidence Levels,
Parameters of the Energy Model.]

What such a comparison does reveal is how close the normal-theory regions
come to the sum-of-squares regions, which in general are more reliable. The
two sets of confidence regions may be compared more conveniently by plotting
for each parameter component the one-dimensional projections of the two
regions onto the coordinate axis, as a function of $\sqrt{F^{\pi}_{1,n-p}}$. A set of such
plots for the parameters of the energy model is shown in Figure 4, for our
example data set.

Consistent with the indication from the previous figure, Figure 4 shows
that for confidence levels of practical interest, say $1-\pi \le 0.999$, the
Gaussian-based confidence intervals (indicated by 'N' for Normal) are in
good agreement with those based on the sum-of-squares methods for all three
parameters $\alpha$, $\beta$, and $\tau$ of the basic model. For the important index $\Gamma$,
the two sets of confidence intervals are in virtually perfect agreement even
for extremely high confidence levels. Thus, provided the sum-of-squares
method gives accurate confidence intervals for this data set, the Gaussian
approximation also appears to be trustworthy.

The  visual  indicators  just  described  are  unsatisfying  in  two  major 
respects.  First,  they  are  only  qualitative,  giving  no  firm  basis  for 
determining  whether  the  Gaussian  or  sum-of-squares  regions  are  justified  as 
confidence  regions.  Secondly,  they  require  evaluation  of  sum-of-squares 
contours.  In  many  cases,  a  justification  for  the  normal  approximation  is 
sought  precisely  because  evaluation  of  sum-of-squares  contours  is  difficult  or 


We address these difficulties in the remainder of this paper. First, in
Section 7 we introduce two simple quantitative measures of intrinsic
nonlinearity which are particularly suitable for nondifferentiable models. We
then relate these measures to "effective curvatures" for segmented models, in
Section 8, and apply the effective curvatures to the energy model in Section
9. Finally, in Section 10, we suggest an alternate approach, which yields
effective parameter-effects as well as intrinsic curvatures.

7.  Quantifying  Nonlinearity 


As noted previously, the small-sample validity of the approximate
confidence intervals determined by Equation (5.3) depends on how nearly planar
the solution locus $L$ is over the region of interest. For the continuously
differentiable model, the intrinsic curvature $K_v^N$ measures the departure from
the tangent plane in terms of the rate of change, normal to that plane, of a
tangent vector $t_v$. For both segmented and smoother models, this departure can
be measured in other ways, two of which are considered in this section.

A direct measure of the departure from the plane is the distance $\Delta$
between the tangent plane at $\eta(\hat\theta)$ and a point $\eta(\theta)$ at the edge of the
region of interest, that is, for $\theta$ lying on a sum-of-squares contour as
defined by Equation (5.3). The gap is easily evaluated at a point $\eta(\theta)$ as
the square root of the residual sum of squares from a regression of the secant
$\eta(\theta) - \eta(\hat\theta)$ on the derivative matrix $\dot\eta(\hat\theta)$, which defines the tangent plane
at $\eta(\hat\theta)$. The solution locus may be considered "nearly planar", and the
sum-of-squares region an adequate approximate confidence interval, if the gap $\Delta$
is small compared to the radius of the sphere defining the region. This
radius, as given by Equation (5.3), is $\sqrt{(1+f)\,\mathrm{RSS}(\hat\theta)}$, where
$f = q F^{\pi}_{q,n-p}/(n-p)$. Alternatively, following Bates and Watts (1980), we may
simply compare the gap $\Delta$ with $\sqrt{f\,\mathrm{RSS}(\hat\theta)}$, the radius of the sphere's
intersection with the tangent plane at $\eta(\hat\theta)$.
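
The gap computation described above amounts to projecting the secant vector onto the tangent plane and taking the length of what is left over. The sketch below does this with a Gram-Schmidt orthogonalization in plain Python; it is one possible implementation under that assumption, with hypothetical function names and a hypothetical three-dimensional example, not the procedure used in the study.

```python
import math


def dot(u, w):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, w))


def residual_after_projection(vec, columns):
    """Part of vec orthogonal to the span of columns (modified Gram-Schmidt)."""
    basis = []
    for col in columns:
        w = list(col)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        length = math.sqrt(dot(w, w))
        if length > 1e-12:  # skip columns that add no new direction
            basis.append([wi / length for wi in w])
    r = list(vec)
    for b in basis:
        c = dot(r, b)
        r = [ri - c * bi for ri, bi in zip(r, b)]
    return r


def gap(eta_theta, eta_hat, tangent_columns):
    """Distance from eta(theta) to the tangent plane at eta(theta_hat)."""
    secant = [a - b for a, b in zip(eta_theta, eta_hat)]
    r = residual_after_projection(secant, tangent_columns)
    return math.sqrt(dot(r, r))


# Hypothetical example: tangent plane spanned by the first two coordinate axes.
delta = gap([1.0, 2.0, 3.0], [0.0, 0.0, 0.0],
            [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
print(delta)  # 3.0
```

For the energy model the tangent columns would be the vector of ones, the degree-day vector, and the heating rate times the degree-day derivative, all evaluated at the least squares estimate.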

For each of 75 aggregate data sets fit to Equation (2.1), the maximum gap
$\Delta_{max}$ was evaluated on the "one-standard-error" contour defined by
$f = 1/(n-p)$ (i.e., $q = 1$, $F^{\pi}_{q,n-p} = 1$), for which $\sqrt{f\,\mathrm{RSS}(\hat\theta)} = s$. The
procedure used to compute $\Delta_{max}$ is described below in Section 7. The ratio
of $\Delta_{max}$ to $s = \sqrt{\mathrm{RSS}(\hat\theta)/(n-3)}$ ranged from 0.02 to 0.25, with a median
of 0.08. Thus, along this contour, the greatest departure of the solution
locus from the approximating tangent plane was typically less than 10% of the
distance from a point on the contour to $\eta(\hat\theta)$, and at worst was 25% of this
distance. On this basis, the planar approximation appears to be reasonable
for most data sets arising for the energy model.

The gap $\Delta$ can be used as a measure of intrinsic nonlinearity for either
a segmented or a smooth model. Note also that the gap indicates the total
intrinsic nonlinearity in a particular direction, whether caused by a
continuous or a discontinuous change in the derivative $\dot\eta$.

A second measure, which reflects the effect of nondifferentiability alone, is
the angle $\phi$ between the two limiting tangent planes at a point of
discontinuity of $\dot\eta$. In the case of a piecewise planar model such as
that given by Equation (2.1), $\phi$ is simply the angle between planar
segments. For the general model with discontinuous derivative in $\tau$, we
denote by $U_-$ and $U_+$, respectively, the left and right-hand limits of
$\partial\eta/\partial\tau$ at a point of discontinuity, and by $V$ the matrix
of derivatives of $\eta$ with respect to the other parameters at that point.
Then the angle at the discontinuity is given by

$$\cos(\phi) = \frac{(U_- \perp V)'(U_+ \perp V)}{|U_- \perp V|\,|U_+ \perp V|},$$

where the notation $\perp V$ denotes the component orthogonal to $V$.
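As an illustration of the angle formula, the following sketch computes the angle between the two limiting tangent planes from the one-sided derivative vectors. The function names and the small example vectors are hypothetical; the only assumption is that the one-sided derivatives and the matrix $V$ are available as NumPy arrays.

```python
import numpy as np

def orth_component(u, V):
    """Component of the vector u orthogonal to the column space of V."""
    V = np.asarray(V, dtype=float)
    u = np.asarray(u, dtype=float)
    coef, *_ = np.linalg.lstsq(V, u, rcond=None)
    return u - V @ coef

def bend_angle(u_minus, u_plus, V):
    """Angle (radians) between the limiting tangent planes at a
    discontinuity, from the left and right derivative limits."""
    a = orth_component(u_minus, V)
    b = orth_component(u_plus, V)
    cos_phi = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    # Guard against rounding slightly outside [-1, 1].
    return float(np.arccos(np.clip(cos_phi, -1.0, 1.0)))

# Toy example: V spans the first axis, so only the last two
# coordinates of each derivative vector matter.
V = np.array([[1.0], [0.0], [0.0]])
u_minus = np.array([2.0, 1.0, 0.0])
u_plus = np.array([-1.0, 1.0, 1.0])
phi = bend_angle(u_minus, u_plus, V)
```

In this toy case the orthogonal components are (0, 1, 0) and (0, 1, 1), so the angle is 45 degrees.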

For the same 75 aggregate data sets fit by Equation (2.1), the angle $\phi$
evaluated at the integers $\hat k$ and $\hat k + 1$ bracketing the estimate
$\hat\tau$ ranged from 0.02 to 0.13 radians, with a median of 0.06 radians.
These small angles again indicate that the intrinsic nonlinearity is slight
for this model. However, the impact of the bend in the solution locus depends
not just on the magnitude of a single bend, but also on how many bends there
are in a region of interest.

Certainly  other  direct  measures  of  nonlinearity  could  be  considered  for 
segmented  models.  The  appeal  of  the  two  proposed  here  will  emerge  as  we 
proceed. 

8.  Effective  Curvature  Measures  for  Segmented  Models 

The measures described in the previous section allow us to associate
numbers with nonlinearity, but still leave us with the question of what the
numbers mean. How small must the gap $\Delta$ or angle $\phi$ be for the
intrinsic nonlinearity to be considered negligible? As noted above, the angles
$\phi$ and the spacing between points of discontinuity of $\dot\eta$ together
indicate the severity of intrinsic nondifferentiability. The present author
(Goldberg, 1982) has related the angles and spacing to the shape of the
observed likelihood function, and to the performance of fitting procedures.
For purposes of inference, however, we are concerned with the total intrinsic
nonlinearity. Hence, we focus now on the gap $\Delta$, which incorporates both
instantaneous and continuous changes in the derivative $\dot\eta$.

By considering the relation of the gap $\Delta$ to the intrinsic curvature, as
defined by Equation (3.7), for smooth models, we will obtain an expression for
the effective curvature of segmented models. Effective curvatures make it
possible to think of such models in the same terms as the more familiar smooth
models.

In the smooth case, we can approximate the geodesic curve from
$\eta(\hat\theta)$ to $\eta(\theta)$ by a parabola centered at
$\eta(\hat\theta)$, as illustrated in Figure 5. A parabola defined by
$y_2 = c\,y_1^2$ has curvature at $y_1 = 0$ given by

$$\frac{|a|}{|t|^2} = \frac{|[0,2c]'|}{|[1,0]'|^2} = 2c = \frac{2y_2}{y_1^2}.$$

The curvature at the center can therefore be determined from any point
$(y_1, y_2)$ of the parabola.

For the parabola which ideally represents a cross-section of the solution
locus, $y_1$ and $y_2$ correspond respectively to the tangential and normal
components of the secant $\eta(\theta) - \eta(\hat\theta)$. Thus, denoting by
$P$ the projection matrix onto $\dot\eta(\hat\theta)$, and by $\zeta(\theta)$
the secant $\eta(\theta) - \eta(\hat\theta)$, we have

$$K^{\perp} = \frac{2\Delta}{|P\zeta(\theta)|^2}. \qquad (8.1)$$

Hence

$$\gamma^{\perp} = s\sqrt{p}\,K^{\perp} = \frac{2s\sqrt{p}\,\Delta}{|P\zeta(\theta)|^2}. \qquad (8.2)$$
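A short numerical check may help fix ideas. The sketch below computes an effective curvature as twice the gap divided by the squared tangential component of the secant, and confirms that for an exact parabola $y_2 = c\,y_1^2$ it returns $2c$ at any point. The function names are hypothetical, and the relative version assumes the standard-radius scaling $s\sqrt{p}$ of Bates and Watts (1980).

```python
import numpy as np

def effective_curvature(secant, jac_hat):
    """Twice the gap divided by the squared length of the secant's
    projection onto the tangent plane spanned by jac_hat's columns."""
    J = np.asarray(jac_hat, dtype=float)
    z = np.asarray(secant, dtype=float)
    coef, *_ = np.linalg.lstsq(J, z, rcond=None)
    tangential = J @ coef
    gap = np.linalg.norm(z - tangential)
    return float(2.0 * gap / (tangential @ tangential))

def effective_relative_curvature(secant, jac_hat, s, p):
    """Effective curvature scaled by the standard radius s * sqrt(p)
    (assumed scaling, following Bates and Watts, 1980)."""
    return s * np.sqrt(p) * effective_curvature(secant, jac_hat)

# Parabola y2 = c * y1**2 with tangent direction [1, 0] at the origin.
c = 0.7
y1 = 0.4
secant = np.array([y1, c * y1 ** 2])
jac = np.array([[1.0], [0.0]])
K = effective_curvature(secant, jac)  # should reproduce 2 * c
```

For a segmented solution locus the same function can be applied to any secant, which is what makes the result depend on the point chosen.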


To complete the connection between intrinsic curvature and the direct measure
of intrinsic nonlinearity $\Delta$, it is necessary to specify the direction
$v$ associated with the intrinsic curvature as given by Equation (3.7). This
direction is simply the coordinate with respect to $\dot\eta(\hat\theta)$ of
the projection onto the tangent plane of the secant $\zeta(\theta)$.


Thus, for a given point $\eta(\theta)$, the squared gap $\Delta^2$ at
$\eta(\theta)$ is the residual sum of squares from a regression of
$\zeta(\theta)$ on $\dot\eta(\hat\theta)$, while the direction $v$ is given by
the coefficients of this regression. Further, the multiple correlation for the
regression is the cosine of the angle $\omega$ between the secant
$\zeta(\theta)$ and the tangent plane defined by $\dot\eta(\hat\theta)$. For a
piecewise planar model, such as the energy model, the angle $\phi$ defined
above can be related to this secant angle $\omega$. Specifically, whenever
$\eta(\theta)$ and $\eta(\hat\theta)$ lie on adjacent planar segments, the
angle $\phi$ represents an upper bound on the secant angle $\omega$ in the
direction $v$.
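The three regression by-products just described (squared gap, direction, and cosine of the secant angle) can be obtained together. The sketch below is illustrative only; the name `secant_regression` is hypothetical, and the multiple correlation is taken to be the uncentered one, that is, the ratio of the fitted length to the secant length.

```python
import numpy as np

def secant_regression(secant, jac_hat):
    """Regress the secant on the derivative matrix.

    Returns (v, gap_sq, cos_omega): the regression coefficients, the
    residual sum of squares, and the cosine of the angle between the
    secant and the tangent plane.
    """
    J = np.asarray(jac_hat, dtype=float)
    z = np.asarray(secant, dtype=float)
    v, *_ = np.linalg.lstsq(J, z, rcond=None)
    fitted = J @ v
    gap_sq = float(np.sum((z - fitted) ** 2))
    cos_omega = float(np.linalg.norm(fitted) / np.linalg.norm(z))
    return v, gap_sq, cos_omega

# Toy example with a 3-4-12 secant: tangential length 5, normal length 12.
J = np.array([[1.0, 0.0], [0.0, 2.0], [0.0, 0.0]])
z = np.array([3.0, 4.0, 12.0])
v, gap_sq, cos_omega = secant_regression(z, J)
```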

It is important to note that the direction $v$, which indicates the line in
the tangent plane pointing toward $\eta(\theta)$, will not in general coincide
with $\theta - \hat\theta$. The reason is that the sample-space image of the
parameter-space segment $\hat\theta\theta$ is in general a curve, not a
straight line. Thus, the vector
$t_{\theta-\hat\theta} = \dot\eta\,(\theta - \hat\theta)$, which is the tangent
at $\hat\theta$ to the curved image of $\hat\theta\theta$, does not point
toward $\eta(\theta)$.

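A more precise relationship between $v$ and $\theta - \hat\theta$ is
determined by expanding $\eta(\theta)$ about $\hat\theta$. We have

$$\zeta(\theta) \approx \dot\eta\,(\theta - \hat\theta) + \tfrac{1}{2}\left[(\theta - \hat\theta)'\,\ddot\eta\,(\theta - \hat\theta)\right],$$

so that

$$v = \theta - \hat\theta + \tfrac{1}{2}\,(\dot\eta'\dot\eta)^{-1}\dot\eta'\,a^{\parallel}, \qquad (8.3)$$

with $a^{\parallel}$ the tangential component of the acceleration as defined by
Equation (3.6). The difference between $v$ and $\theta - \hat\theta$ is thus
closely related to the parameter-effects curvatures. Recall that the gap
$\Delta$ and the secant angle $\omega$ themselves measure intrinsic
nonlinearity only.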

For a smooth model, Equation (8.1) or (8.2) can be regarded as an
approximation to the actual curvature. For a segmented model, we will take
these equations as the definitions of effective curvatures $K^{\perp}$ and
$\gamma^{\perp}$. In the latter case, the effective curvatures so obtained
will depend strongly on the size of the regions of interest and on the
proximity of $\theta$ and $\hat\theta$ to a discontinuity point of
$\dot\eta$. One may question the value of a curvature measure which is so
sensitive to the points chosen for its evaluation. In fact, however, such a
dependence is entirely appropriate for nondifferentiable models.

Nondifferentiability means that a single number indicating a local rate
of change (i.e., a derivative or curvature) does not adequately represent more
global behavior. Describing nonlinearity in terms of curvature amounts to
approximating the solution locus $L$ by a spherical or parabolic surface,
which coincides with $L$ at the point of the fit. For a smooth model, the
same approximation is valid over a wide range, essentially until the
second-order expansion of the model function $\eta$ breaks down. For the
segmented model, on the other hand, a different smooth approximation is
relevant depending on the width of the region of interest. For inferences in a
close neighborhood of a discontinuity point $\theta_0$, it is wise to consider
a surface of small radius of curvature, which approximates $L$ well in that
neighborhood. For inferences over a wider region, a sphere of larger radius,
which might be relatively far from $L$ in the immediate vicinity of
$\theta_0$, would be more appropriate.

To apply Beale's formula (Equation (5.4)), the root mean square intrinsic
relative curvature $\gamma^{\perp}_{rms}$ is required, while to use the
methods of Bates and Watts (1981), Hamilton, Watts and Bates (1982), or Box
(1971) requires the entire acceleration array $[\ddot\eta]$. Thus, the
approximation given by Equation (8.2), which provides estimates of the
relative intrinsic curvature in a particular direction, still leaves much work
to be done if the procedures which make the concept of curvature so appealing
in general applications are to be used.

In many cases, however, nonlinearities are slight, so that the
corrections offered by these procedures are negligible. In particular, the
experience of Bates and Watts (1980) indicates that the relative intrinsic
nonlinearity of most models is quite small. Thus, a quick method of
establishing that the maximum relative intrinsic curvature is sufficiently
small could frequently obviate the need for more complicated computations.
This is the approach taken below in applying effective curvature measures to
the energy model.


9.  Effective  Curvature  of  the  Energy  Model 

Above, we have seen several indications that the nonlinearities in the
energy model are slight: the small discrepancies among the RSS functions in
Figure 2, the close correspondence between the Gaussian and sum-of-squares
confidence intervals in Figure 4, the mild tangent angles $\phi$, and the
small ratios $\Delta_{max}/s$ of the gap to the radius of a sum-of-squares
region. Hence, to determine that the Gaussian approximation is adequate, it
should be sufficient in most cases to verify that the maximum curvatures are
small. In this section, we consider only the intrinsic curvature
$\gamma^{\perp}$.

Appendix A describes how the maximum effective intrinsic curvature
$\gamma^{\perp}_{max}$ can be found for the energy model, on a sum-of-squares
contour defined by Equation (5.3). Using the approximation given by Equation
(A.1) for the case $F_{q,n-p} = 1$ with $q = 1$, Equation (8.2) yields

$$\gamma^{\perp}_{max} = \frac{2\sqrt{p}\,\Delta_{max}}{s}. \qquad (9.1)$$


The results in Section 7 on the ratio $\Delta_{max}/s$ of the maximum gap to
the radius of a one-standard-error sum-of-squares region can now be translated
into statements about effective curvatures for the energy model. For the 75
data sets, the maximum (effective) relative intrinsic curvature
$\gamma^{\perp}_{max}$ ranges from 0.07 to 0.85, with a median of 0.29.
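The translation from gap ratios to effective curvatures is simple arithmetic. The sketch below assumes that Equation (9.1) has the form $\gamma^{\perp}_{max} = 2\sqrt{p}\,\Delta_{max}/s$ with $p = 3$ parameters; the function name is hypothetical. Because the ratios quoted in Section 7 are rounded to two decimals, feeding them back through the formula reproduces the reported curvatures only approximately.

```python
import math

def curvature_from_gap_ratio(gap_over_s, p):
    """Effective relative intrinsic curvature from the ratio of the
    maximum gap to s, assuming gamma = 2 * sqrt(p) * (gap / s)."""
    return 2.0 * math.sqrt(p) * gap_over_s

# Rounded ratios reported for the 75 data sets (minimum, median, maximum).
for label, ratio in [("min", 0.02), ("median", 0.08), ("max", 0.25)]:
    print(label, round(curvature_from_gap_ratio(ratio, p=3), 3))
```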

When the second-derivative array $[\ddot\eta]$ has only one non-zero vector,
on the diagonal, it is possible to show that the root mean square curvature
$\gamma_{rms}$ and the maximum curvature $\gamma_{max}$ are related by

$$\gamma_{rms} = \sqrt{\frac{3}{p(p+2)}}\;\gamma_{max}. \qquad (9.2)$$

For the energy model, with $\ddot\eta$ given by (3.4), the normal component of
$[\ddot\eta]$ has a single non-zero vector, the orthogonal component of
$\partial^2\eta/\partial\tau^2 = \beta\,\partial F/\partial\tau$. It is
therefore possible to evaluate Beale's inflation factor, given by (5.4),
knowing the effective intrinsic curvature $\gamma^{\perp}_{max}$ only in the
direction of maximum curvature.
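One way to see where a relation of this kind comes from: if the only non-zero acceleration vector sits on the diagonal, the curvature in a unit direction $d$ is $\gamma_{max} d_p^2$, and for directions uniformly distributed on the unit sphere in $p$ dimensions the mean of $d_p^4$ is $3/(p(p+2))$. The sketch below checks this by simulation. It assumes the coefficient $\sqrt{3/(p(p+2))}$ and orthonormal tangent coordinates; the function names are hypothetical.

```python
import numpy as np

def rms_from_max(gamma_max, p):
    """Closed-form rms curvature when a single diagonal acceleration
    vector is non-zero (assumed coefficient sqrt(3 / (p * (p + 2))))."""
    return gamma_max * np.sqrt(3.0 / (p * (p + 2)))

def rms_monte_carlo(gamma_max, p, n_draws=200_000, seed=0):
    """Simulated rms of gamma_max * d_p**2 over uniformly distributed
    unit directions d in p dimensions."""
    rng = np.random.default_rng(seed)
    d = rng.standard_normal((n_draws, p))
    d /= np.linalg.norm(d, axis=1, keepdims=True)  # project onto unit sphere
    curvature = gamma_max * d[:, -1] ** 2
    return float(np.sqrt(np.mean(curvature ** 2)))

analytic = rms_from_max(1.0, p=3)
simulated = rms_monte_carlo(1.0, p=3)
```

The two values should agree to within simulation error, which supports using the maximum curvature alone when only one direction contributes.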


For the worst case then ($\gamma^{\perp}_{max} = 0.85$), Equation (9.2) yields
an inflation factor of 1.08, while for the median value
($\gamma^{\perp}_{max} = 0.29$) we get 1.01. Thus, if Beale's formula holds
approximately in the nondifferentiable case, with the effective relative
intrinsic curvature defined by (8.2), then the correction required to make the
sum-of-squares regions (5.3) conservative is minimal in most cases. In fact,
the factor is greater than 1.03 for only two of the 75 cases studied.


10.  Smoothing  the  Model  Function 

The effective curvatures defined for segmented models by Equations (8.1)
and (8.2) are based implicitly on an approximation to the solution locus $L$
by some smooth surface. Another approach is to approximate the model function
$\eta$ explicitly over the region of interest by a smooth function
$\tilde\eta$, then consider the curvature of $\tilde\eta$. Obviously, this
procedure provides both parameter-effect and intrinsic curvatures.


The approximation method must be left as an ad hoc procedure to be chosen
for the particular model studied, and in general will involve considerably
more computation than the measures suggested above. On the other hand, the
curvature of a close smooth approximation is arguably the best definition of
curvature for a segmented model. Furthermore, if the same approximation
$\tilde\eta$ applies over a wide range of values of $\theta$, then the
curvature array may need to be evaluated only once for all confidence levels
of interest.

In the case of the energy model, a very close approximation to the model
function was obtained for each data set by smoothing the nondifferentiable
degree-day variable $H(\tau)$. For each month $m$, a smooth function
$\tilde H_m(\tau)$ was obtained by fitting a quadratic function

$$H_m(\tau_i) = H_m(\hat\tau) + b_m(\tau_i - \hat\tau) + c_m(\tau_i - \hat\tau)^2 + e_{im}. \qquad (10.1)$$

The coefficients $b_m$ and $c_m$ in Equation (10.1) were found by the method of
least squares, using values $\tau_i = \hat\tau + k$, $k = -5, -4, \ldots, 4, 5$.
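The smoothing step can be sketched as a least squares fit with the intercept held at the actual value of the function at the center. This is an illustration under stated assumptions: the function name `fit_local_quadratic`, the daily temperatures, and the degree-day definition (sum over days of the positive part of the reference temperature minus the daily temperature) are hypothetical stand-ins, since Equation (2.3) is not reproduced in this section.

```python
import numpy as np

def fit_local_quadratic(H, tau_hat, half_width=5):
    """Fit H(tau_i) - H(tau_hat) = b*(tau_i - tau_hat) + c*(tau_i - tau_hat)**2
    by least squares at tau_i = tau_hat + k, k = -half_width..half_width.

    Returns (b, c, H_smooth), where H_smooth is the fitted quadratic.
    """
    k = np.arange(-half_width, half_width + 1, dtype=float)
    center = H(tau_hat)
    y = np.array([H(tau_hat + ki) for ki in k]) - center
    X = np.column_stack([k, k ** 2])
    (b, c), *_ = np.linalg.lstsq(X, y, rcond=None)

    def H_smooth(tau):
        d = tau - tau_hat
        return center + b * d + c * d ** 2

    return float(b), float(c), H_smooth

# Hypothetical daily mean temperatures (degrees F) for one month.
daily_temps = np.array([48.0, 52.0, 55.0, 58.0, 61.0, 63.0,
                        64.0, 66.0, 67.0, 70.0, 72.0, 75.0])

def degree_days(tau):
    """Assumed degree-day function: piecewise linear in tau, with a
    kink at each daily temperature."""
    return float(np.sum(np.maximum(tau - daily_temps, 0.0)))

b_m, c_m, H_tilde = fit_local_quadratic(degree_days, tau_hat=65.0)
```

Because the intercept is fixed, the smooth curve passes through the actual degree-day value at the center of the fit.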

Figure 6 shows the actual and smoothed degree-days for each month $m$, from
August 1969 to July 1970, obtained by this procedure with $\hat\tau$ set equal
to 65°F. The figure shows that the approximation $\tilde H_m$ is quite close
to the actual $H_m$, not only over the range of the fit, but also considerably
beyond. Table 1 shows the results of the regressions for the twelve months.
The $R^2$ values are quite high in all cases. In addition, the coefficient
$b_m$ is generally very close to the derivative $F_m(\hat\tau)$, so that the
derivative $\dot{\tilde H}_m(\hat\tau)$ is also close to the derivative of the
original function. In all data sets studied, the least squares estimates
$\tilde\theta$ found by using the approximation
$\tilde\eta(\alpha,\beta,\tau) = \alpha 1 + \beta\tilde H(\tau)$ were also
quite close to the original estimates $\hat\theta$ corresponding to the true
model; the differences $\tilde\theta_j - \hat\theta_j$ were found to be on the
order of 10% of the standard errors of $\hat\theta_j$.


[Figure 6: actual and smoothed degree-days plotted against $\tau$ (°F), with
the range of fit marked. Actual degree-days are indicated by plotted points,
the approximations by the continuous curves. The fits are for aggregate degree
days, defined by Equation (2.3), August 1969 - July 1970.]

[Table 1: Coefficients from Fitting Equation]


Figure 7 shows a plot of the maximum effective relative intrinsic curvatures
$\gamma^{\perp}_{max}$, computed from the gap $\Delta_{max}$ using Equation
(9.1), versus the relative intrinsic curvatures $\tilde\gamma^{\perp}_{max}$
for the smoothed model function $\tilde\eta$. The figure shows a weak,
positive relation between the two curvature measures over the 75 data sets.
However, the formal curvature $\tilde\gamma^{\perp}_{max}$ for the smoothed
model is almost always larger than the effective curvature
$\gamma^{\perp}_{max}$ derived from the gap $\Delta_{max}$. Evidently, then,
for $\theta$ at a distance of one standard error from $\hat\theta$ (where the
gap $\Delta$ was evaluated), the original function $\eta$ tends to be closer
to the tangent plane at $\eta(\hat\theta)$ than is the approximation
$\tilde\eta$.

The disparity between the two measures $\tilde\gamma^{\perp}_{max}$ and
$\gamma^{\perp}_{max}$ does seem to depend on the standard error of
$\hat\tau$, which determines how many bends in the solution locus occur
between $\hat\theta$ and the point $\theta$ where the gap $\Delta_{max}$ was
evaluated. In general, the larger disparities are associated with larger
standard errors (around 3°F), while for the data sets for which
$\tilde\gamma^{\perp}_{max}$ and $\gamma^{\perp}_{max}$ are roughly equal the
standard error of $\hat\tau$ is relatively small (less than or equal to 1°F).


The curvatures $\tilde\gamma^{\perp}_{max}$ computed for the smooth model
$\tilde\eta$ not only tend to be larger than the effective curvatures
$\gamma^{\perp}_{max}$ based on the gap, but are also more spread out. The
gap-based effective curvature $\gamma^{\perp}_{max}$ is derived from a single
point $\eta(\theta)$, where the value of $\tau$ corresponding to that point is
anywhere from one to three (or in one case seven) degrees from $\hat\tau$, and
each degree represents a point of discontinuity of $\dot\eta$. It is therefore
somewhat surprising that smoothing $\eta$ over ten integer values of $\tau$
yields a measure $\tilde\gamma^{\perp}_{max}$ which is more erratically
behaved than that based on the gap. Whatever the reason for this behavior, the
relatively large values of $\tilde\gamma^{\perp}_{max}$ found for a few data
sets serve as a warning that at some confidence levels the impact of intrinsic
nonlinearity may be greater than is indicated by the gap evaluated on a
one-standard-error sum-of-squares contour.




As discussed above, it may not make sense to try to describe the shape of
a piecewise differentiable surface in terms of the curvature of a single
quadratic approximation. Even though the approximation $\tilde\eta$ is quite
close to the original function $\eta$, the correspondence between the two
functions must vary with distance, as well as direction, from
$\eta(\hat\theta)$. Whether the approximation $\tilde\eta$, and its
curvatures, can be considered adequate to describe the behavior of $\eta$
depends, of course, on the degree of precision required. The ratio
$\tilde\gamma^{\perp}_{max}/\gamma^{\perp}_{max}$ of the two measures is less
than two for most of the data sets, and is greater than four for only two.

The approximation $\tilde\eta$ also offers a measure of parameter-effects
curvature $\tilde\gamma^{\parallel}$. Unfortunately, there is no simple way to
determine the direction $v$ in which $\tilde\gamma^{\parallel}_v$ is
maximized. However, as explained in Appendix A, a good indication of the
strength of parameter effects is given by $\tilde\gamma^{\parallel}_v$ for
$v = [-\beta\bar F, 0, 1]'$, corresponding to $t_v = \beta F \perp 1$.

For the 75 data sets studied here, $\tilde\gamma^{\parallel}_v$ for this
direction ranged from 0.1 to 1.4, with a median of 0.2. The small median value
indicates that parameter-effects nonlinearities are slight in most cases. For
the data set used as an example throughout this paper,
$\tilde\gamma^{\parallel}_v = 0.46$, which is the 85th percentile of the 75
observed values. Thus, for most data sets, the parameter-effects
nonlinearities appear to be smaller than was seen for the example data set. As
a result, the Gaussian approximation may typically be expected to perform as
well as or better than is indicated in Figure 4 over that range of confidence
levels.

11.  Conclusion 

We have presented several methods for examining nonlinearity in awkward
models. Although the primary focus has been a segmented model, the visual
indicators and the curvature measures proposed may also be used for
continuously differentiable models. The visual indicators can reveal a great
deal about the behavior of the model function, but computation of the required
quantities may be quite cumbersome. By contrast, the quantitative measures
proposed may be easier to compute than formal curvatures based on second
derivatives. In addition, in cases where the second-order approximation does
not hold over an entire region of interest, an effective curvature based on
points at the edge as well as the interior of that region may be more
meaningful than the formal curvature evaluated at the point of the fit.

For the energy model which motivated this study, all the measures
explored indicate that the nonlinearities are generally small for data sets
like those examined in this work. The intrinsic nonlinearity, as measured by
the tangent angle $\phi$ and by approximate curvatures, is small enough that
the sum-of-squares method gives good approximate confidence regions. A
combination of direct comparisons of Gaussian and sum-of-squares regions (for
a particular data set) and examination of parameter-effect curvatures
$\tilde\gamma^{\parallel}$ (for a large number of sets) leads to the
conclusion that the more easily computed Gaussian approximation should be
acceptable in most cases.

Further  study  is  needed  to  assess  the  performance  of  the  effective 
curvature  measures  proposed  here,  for  a  variety  of  segmented  and  smooth 
models.  In  this  context,  both  the  validity  of  the  approximations  and  the 
degree  to  which  these  methods  actually  facilitate  computations  are 
important.  Also  useful  would  be  efficient  means  of  finding  the  maximum 
intrinsic  or  parameter-effects  curvature,  on  the  basis  of  which  more  detailed 
computations  might  be  forgone.  A  paper  currently  in  preparation  describes 
procedures  for  obtaining  mean  square  effective  curvatures,  both  intrinsic  and 
parameter-effect,  based  on  methods  developed  here,  with  an  emphasis  on 
applications  to  smooth  models. 




Appendix  A.  Finding  Maximum  Curvatures  for  the  Energy  Model 

For the energy model defined by (2.1), the maximum intrinsic curvature at
$\hat\theta = [\hat\alpha, \hat\beta, \hat\tau]'$ is in the direction of
$F(\hat\tau) \perp [1, H(\hat\tau)]$, the component of $F(\hat\tau)$
orthogonal to the vectors $1$ and $H(\hat\tau)$. The formal curvature
$\tilde\gamma^{\perp}_{max}$ was therefore evaluated in the direction of
$\tilde F(\tilde\tau) \perp [1, \tilde H(\tilde\tau)]$, using the estimate
$\tilde\tau$ and the second derivative $\ddot{\tilde\eta}$ for the smoothed
model function $\tilde\eta$. Finding the effective curvature
$\gamma^{\perp}_{max}$ in the corresponding direction for the original model
is more complicated; as noted in the text, evaluating the gap $\Delta$ at
$\eta(\hat\theta + v)$ does not in general yield the effective curvature in
the desired direction $v$. Rather than searching for points on the
sum-of-squares contour in the indicated direction, a more ad hoc procedure was
used to find $\Delta_{max}$.

To find the maximum gap $\Delta_{max}$ around the sum-of-squares contour, it
is necessary to maximize the residual sum-of-squares from the regression of
$\eta(\theta) - \eta(\hat\theta)$ on $\dot\eta(\hat\theta)$. For the energy
model, this is a regression of
$(\alpha - \hat\alpha)1 + \beta H(\tau) - \hat\beta H(\hat\tau)$ on
$[1, H(\hat\tau), \hat\beta F(\hat\tau)]$. The terms $(\alpha - \hat\alpha)1$
and $\hat\beta H(\hat\tau)$ leave no residual, while $\beta$ is a scalar.
Hence, maximizing the residual from a regression of $H(\tau)$ on
$[1, H(\hat\tau), \hat\beta F(\hat\tau)]$, then multiplying by $\beta_+$, the
maximum of $\beta$ along the contour, yields an overestimate of the maximum
gap $\Delta_{max}$. The maximal divergence of $H(\tau)$ from the plane spanned
by $H(\hat\tau)$ and $\hat\beta F(\hat\tau)$ occurs at values of $\tau$
farthest from $\hat\tau$. Thus, the maximal gap was found by finding the
extreme values $\tau_-$ and $\tau_+$ along the contour, obtaining the
residuals from regressions of $H(\tau_-)$ and $H(\tau_+)$ on
$[1, H(\hat\tau), \hat\beta F(\hat\tau)]$, then multiplying the larger
magnitude residual by $\beta_+$.
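The ad hoc procedure above can be sketched as follows. This is an illustration only: the function names, the simulated daily temperatures, and the particular forms of $H$ (monthly sum of the positive part of $\tau$ minus daily temperature) and $F$ (monthly count of days below $\tau$) are assumptions standing in for Equations (2.1) and (2.3), which are not reproduced here. The extreme values of $\tau$ and the maximum of $\beta$ on the contour are taken as given inputs.

```python
import numpy as np

def residual_norm(y, X):
    """Length of the residual from a least squares regression of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(np.linalg.norm(y - X @ coef))

def max_gap_overestimate(H, F, tau_hat, beta_hat, tau_minus, tau_plus, beta_plus):
    """Overestimate of the maximum gap: the larger residual of H(tau_-)
    and H(tau_+) regressed on [1, H(tau_hat), beta_hat * F(tau_hat)],
    multiplied by the largest beta on the contour."""
    H_hat = H(tau_hat)
    X = np.column_stack([np.ones_like(H_hat), H_hat, beta_hat * F(tau_hat)])
    worst = max(residual_norm(H(tau_minus), X), residual_norm(H(tau_plus), X))
    return beta_plus * worst

# Hypothetical data: 12 months by 30 days of daily temperatures (degrees F).
rng = np.random.default_rng(0)
daily_temps = rng.normal(55.0, 12.0, size=(12, 30))

def H(tau):
    """Assumed monthly degree-days at reference temperature tau."""
    return np.maximum(tau - daily_temps, 0.0).sum(axis=1)

def F(tau):
    """Assumed derivative of H: number of days per month below tau."""
    return (daily_temps < tau).sum(axis=1).astype(float)

gap_bound = max_gap_overestimate(H, F, tau_hat=65.0, beta_hat=1.0,
                                 tau_minus=62.0, tau_plus=68.0, beta_plus=1.2)
```

When both extreme values coincide with the estimate, the regression fits exactly and the bound collapses to zero, as expected.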

Having obtained (overestimates of) the maximum gap, we still need the
tangential secant component $|P\zeta(\theta)|$ to derive effective curvatures
from Equation (8.1) or (8.2). According to the linear approximation,
$|P\zeta(\theta)| = \sqrt{f\,RSS(\hat\theta)}$ for $\theta$ on a contour
defined by (5.3), with $f = qF_{q,n-p}/(n-p)$. Making this approximation,
Equation (8.2) becomes

$$\gamma^{\perp} = \frac{2\sqrt{p}\,\Delta}{q\,F_{q,n-p}\;s}. \qquad (A.1)$$

The maximum parameter-effects curvature is also of interest. For the
parameter effects, only the formal curvature $\tilde\gamma^{\parallel}$ of the
smoothed model is available. As noted in the text, there is no simple way to
determine the direction $v$ in which $\tilde\gamma^{\parallel}_v$ is
maximized. We do know that the tangent vector $t_v$ must be orthogonal to
$\partial\eta/\partial\alpha = 1$, since the model function has zero curvature
in the $\alpha$ direction. In addition, it is clear from Equation (4.2) that
the $\tau$-component of the maximizing $v$ (hence the $\beta F$-component of
$t_v$) must be non-zero. A reasonable measure of the strength of parameter
effects is therefore offered by $\tilde\gamma^{\parallel}_v$ for
$v = [-\beta\bar F, 0, 1]'$, corresponding to $t_v = \beta F \perp 1$.